NSF PAR Search | NSF Public Access Repository


ART: Customizing Accelerators for DNN-Enabled Real-Time Safety-Critical Systems

https://doi.org/10.1145/3716368.3735215

Ji, Shixin; Chen, Xingzhen; Zhuang, Jinming; Zhang, Wei; Yang, Zhuoping; Schultz, Sarah; Song, Yukai; Hu, Jingtong; Jones, Alex; Dong, Zheng; et al (June 2025, ACM)

Real-time systems are widely applied in areas such as autonomous vehicles, where safety is the key metric. However, on the FPGA platform, most prior accelerator frameworks do not address schedulability in such real-time safety-critical systems, leaving deadlines unmet, which can lead to catastrophic system failures. To address this, we propose the ART framework, a hardware-software co-design approach that transforms baseline accelerators into "real-time guaranteed" accelerators. On the software side, ART performs schedulability analysis and preemption point placement, optimizing task scheduling to meet deadlines and enhance throughput. On the hardware side, ART integrates the Global Earliest Deadline First (GEDF) scheduling algorithm, implements preemption, and applies source code transformation to turn baseline HLS-based accelerators into designs targeted for real-time systems that are capable of saving and resuming tasks. ART also includes integration, debugging, and testing tools for full-system implementation. We demonstrate the methodology of ART on two kinds of popular accelerator models and evaluate it on the AMD Versal VCK190 platform, where ART meets schedulability requirements that baseline accelerators fail to meet. ART is lightweight, utilizing <0.5% of resources. With about 100 lines of user input, ART generates about 2.5k lines of accelerator code, making it a push-button solution.
Free, publicly-accessible full text available June 29, 2026
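The GEDF policy named in the ART abstract above can be illustrated with a minimal sketch. This is not ART's implementation (which targets HLS-based FPGA accelerators); it only shows the general rule that, at each scheduling point, the ready jobs with the earliest absolute deadlines are dispatched to the available accelerators. The `Job` type, the `gedf_select` function, and the example job names and deadlines are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job record: a name and an absolute deadline (time units)."""
    name: str
    deadline: int


def gedf_select(ready, num_accelerators):
    """Pick the jobs a GEDF scheduler would run at this scheduling point.

    Jobs are ranked by absolute deadline (earliest first); ties are broken
    by name so the result is deterministic. At most `num_accelerators`
    jobs are selected; the rest wait or are preempted.
    """
    ranked = sorted(ready, key=lambda job: (job.deadline, job.name))
    return ranked[:num_accelerators]


# Illustrative ready queue with made-up deadlines.
ready = [Job("camera", 30), Job("lidar", 12), Job("radar", 20)]
running = gedf_select(ready, 2)
print([job.name for job in running])  # ['lidar', 'radar']
```

With two accelerators, the two earliest-deadline jobs run and the remaining job waits until an accelerator frees up or a later scheduling point gives it priority.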
Towards Accelerator Customization in Real-time Safety-critical Systems

https://doi.org/10.1145/3706628.3708841

Ji, Shixin; Chen, Xingzhen; Zhang, Wei; Yang, Zhuoping; Zhuang, Jinming; Schultz, Sarah; Song, Yukai; Hu, Jingtong; Jones, Alex K; Dong, Zheng; et al (February 2025, ACM)

Free, publicly-accessible full text available February 27, 2026
CHEF: A Framework for Deploying Heterogeneous Models on Clusters With Heterogeneous FPGAs

https://doi.org/10.1109/TCAD.2024.3438994

Tang, Yue; Song, Yukai; Elango, Naveena; Priya, Sheena Ratnam; Jones, Alex K; Xiong, Jinjun; Zhou, Peipei; Hu, Jingtong (November 2024, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems)

Full Text Available
CHEF: A Framework for Deploying Heterogeneous Models on Clusters with Heterogeneous FPGAs

Tang, Yue; Song, Yukai; Elango, Naveena; Priya, Sheena R; Jones, Alex K; Xiong, Jinjun; Zhou, Peipei; Hu, Jingtong (October 2024, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS)

DNNs are rapidly evolving from streamlined single-modality single-task (SMST) to multi-modality multi-task (MMMT) with large variations for different layers and complex data dependencies among layers. To support such models, hardware systems also evolved to be heterogeneous. The heterogeneous system comes from the prevailing trend to integrate diverse accelerators into the system for lower latency. FPGAs have high computation density and communication bandwidth and are configurable to be deployed with different designs of accelerators, which are widely used for various machine-learning applications. However, scaling from SMST to MMMT on heterogeneous FPGAs is challenging since MMMT has much larger layer variations, a massive number of layers, and complex data dependency among different backbones. Previous mapping algorithms are either inefficient or over-simplified, which makes them impractical in general scenarios. In this work, we propose CHEF to enable efficient implementation of MMMT models in realistic heterogeneous FPGA clusters, i.e., deploying heterogeneous accelerators on heterogeneous FPGAs (A2F) and mapping the heterogeneous DNNs on the deployed heterogeneous accelerators (M2A). We propose CHEF-A2F, a two-stage accelerators-to-FPGAs deployment approach to co-optimize hardware deployment and accelerator mapping. In addition, we propose CHEF-M2A, which can support general and practical cases compared to previous mapping algorithms. To the best of our knowledge, this is the first attempt to implement MMMT models in real heterogeneous FPGA clusters. Experimental results show that the latency obtained with CHEF is near-optimal while the search time is 10000X less than exhaustively searching the optimal solution.
Full Text Available
GPU Partitioning & Neural Architecture Sizing for Safety-Driven Sensing in Autonomous Systems

https://doi.org/10.1109/ICAA64256.2024.00018

Xu, Shengjie; Hobbs, Clara; Song, Yukai; Ghosh, Bineet; Zhu, Tingan; Aktar, Sharmin; Yang, Lei; Sheng, Yi; Jiang, Weiwen; Hu, Jingtong; et al (October 2024, IEEE)

Full Text Available
Poster Abstract: Neural Architecture Sizing for Autonomous Systems

https://doi.org/10.1109/ICCPS61052.2024.00040

Xu, Shengjie; Hobbs, Clara; Song, Yukai; Ghosh, Bineet; Aktar, Sharmin; Yang, Lei; Sheng, Yi; Jiang, Weiwen; Hu, Jingtong; Duggirala, Parasara Sridhar; et al (May 2024, IEEE)

Full Text Available
